You Can't Beat Frequency (Unless You Use Linguistic Knowledge) - A Qualitative Evaluation of Association Measures for Collocation and Term Extraction

Authors

  • Joachim Wermter
  • Udo Hahn
Abstract

In the past years, a number of lexical association measures have been studied to help extract new scientific terminology or general-language collocations. The implicit assumption of this research was that newly designed term measures involving more sophisticated statistical criteria would outperform simple counts of cooccurrence frequencies. We here explicitly test this assumption. By way of four qualitative criteria, we show that purely statistics-based measures reveal virtually no difference compared with frequency of occurrence counts, while linguistically more informed metrics do reveal such a marked difference.


Similar resources

Presenting a Hybrid Approach based on Two-stage Data Envelopment Analysis to Evaluating Organization Productivity

Measuring the performance of a production system has been an important task in management for purposes of control, planning, etc. Lord Kelvin said: “When you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind.” Hence, manag...


The Domain of the semantics of ‘promise’ in the Holy Quran

Semantics is the branch of linguistics that analyzes the meaning of the words and sentences of a text and identifies parts of speech with regard to semantics. This descriptive-analytic study examines the meaning of ‘promise’ in the Holy Quran, based on principles of semantics, using a collocation approach and a library-based methodology. Also, by virtue of ...


An Exploratory Study on the Use of 'I Love You' in the American Context

This study explores the use of the English locution I love you in the American context. The data were collected through a focus discussion group and a survey questionnaire. 120 college undergraduate students from a large public American university participated in the study with 28 attending the focus discussion group and 92 completing the survey questionnaire. The findings indicated th...


TCtract-A Collocation Extraction Approach for Noun Phrases Using Shallow Parsing Rules and Statistic Models

This paper presents a hybrid method for extracting Chinese noun phrase collocations that combines a statistical model with rule-based linguistic knowledge. The algorithm first extracts all the noun phrase collocations from a shallow parsed corpus by using syntactic knowledge in the form of phrase rules. It then removes pseudo collocations by using a set of statistic-based association measures (...


The Semiotics of a Ghazal by Mowlana

Mowlanā Jalāl al-Din Muhammad Rūmi (604-672 AH) is one of the Iranian poets and Sufis. For those who do not know Mowlavi well, understanding the concepts in his ghazals is a demanding task. The existence of mystical expressions, his broad knowledge of various Islamic sciences, and cultural and literary traditions can be regarded as causes of this. In spite of the growing trend of researching o...




Publication date: 2006